Friendly Noise against Adversarial Noise: A Powerful Defense against Data Poisoning Attack

Neural Information Processing Systems

A powerful category of (invisible) data poisoning attacks modifies a subset of training examples by small adversarial perturbations to change the prediction of certain test-time data. Existing defense mechanisms are not desirable to deploy in practice, as they often either drastically harm the generalization performance, or are attack-specific and prohibitively slow to apply. Here, we propose a simple but highly effective approach that, unlike existing methods, breaks various types of invisible poisoning attacks with the slightest drop in the generalization performance. We make the key observation that attacks introduce local sharp regions of high training loss, which, when minimized, result in learning the adversarial perturbations and make the attack successful. To break poisoning attacks, our key idea is to alleviate the sharp loss regions introduced by poisons. To do so, our approach comprises two components: an optimized friendly noise that is generated to maximally perturb examples without degrading the performance, and a randomly varying noise component. The combination of both components builds a very light-weight but extremely effective defense against the most powerful triggerless targeted and hidden-trigger backdoor poisoning attacks, including Gradient Matching, Bulls-eye Polytope, and Sleeper Agent. We show that our friendly noise is transferable to other architectures, and adaptive attacks cannot break our defense due to its random noise component.
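The two components described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the objective, so the code assumes one plausible form (keep the model's output on the perturbed input close to its output on the clean input, measured by KL divergence, while rewarding a large noise norm under a per-coordinate bound). The linear softmax model, the function names, and the parameters `zeta`, `lam`, and `mu` are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def friendly_noise(W, b, x, zeta=0.1, lam=0.05, steps=100, lr=0.05, seed=0):
    """Sketch of an optimized 'friendly' perturbation for a linear softmax model.

    Assumed objective (not taken from the source):
        minimize  KL(p(x) || p(x + eps)) - lam * ||eps||_2
        subject to  |eps_i| <= zeta
    """
    rng = np.random.default_rng(seed)
    p = softmax(W @ x + b)  # clean prediction, held fixed
    eps = 0.01 * rng.uniform(-zeta, zeta, size=x.shape)  # small random start
    for _ in range(steps):
        q = softmax(W @ (x + eps) + b)
        # For logits W(x + eps) + b, the gradient of KL(p || q) w.r.t. eps is W^T (q - p).
        grad_kl = W.T @ (q - p)
        grad_norm = eps / (np.linalg.norm(eps) + 1e-12)
        eps = eps - lr * (grad_kl - lam * grad_norm)
        eps = np.clip(eps, -zeta, zeta)  # keep the noise within the assumed bound
    return eps


def defended_input(x, eps, mu=0.05, rng=None):
    """Add the fixed friendly noise plus freshly sampled uniform noise, then clip to [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    random_part = rng.uniform(-mu, mu, size=x.shape)  # resampled on every call
    return np.clip(x + eps + random_part, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W, b = rng.normal(size=(3, 8)), np.zeros(3)  # toy stand-in for a trained model
    x = rng.uniform(size=8)                      # toy "image" with values in [0, 1]
    eps = friendly_noise(W, b, x)
    print(defended_input(x, eps, rng=rng))
```

In this sketch the friendly noise is computed once per example, while the random component is redrawn each time the example is used, which mirrors the abstract's distinction between an optimized noise and a randomly varying one.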






A Proofs of Main Results

Neural Information Processing Systems

(conclusion 1). (conclusion 2). Z contains and only contains exogenous noises w.r.t. [...] means source and [...]. Based on Theorem 6, we can readily give a proof of Theorem 2. Note that in our setting where [...] is equivalent to [...]. Theorem 7 (Trek-separation for directed graphical models, Theorem 2.8 in [...]). We now show that Theorem 2 can also be proved by the trek-separation theorem. Proof of Theorem 2 (another version). [...]'s noise components that is not shared in [...]. Therefore, the direction between X and Y is unidentifiable. GIN(Z, Y) must hold, with solution ω.